Limitations of scaling and universality in stock market data 
We present evidence that, if a large enough set of high-resolution stock market data is analyzed,
certain analogies with physics, such as scaling and universality, fail to capture the full complexity
of such data. Despite earlier expectations, the mean value per trade, the mean number of trades
per minute and the mean trading activity do not show scaling with company capitalization; there is
only a non-trivial monotonic dependence. The strength of correlations present in the time series
of traded value is found to be non-universal: the Hurst exponent increases logarithmically with
capitalization. A similar trend is displayed by intertrade time intervals. This is a clear indication
that stylized facts need not be fully universal, but can instead have a well-defined dependence on
company size.



In the last decade, an increasing number of physicists have become devoted to the study of
economic and financial phenomena [1, 2, 3]. One of the reasons for this tendency is that societies
or stock markets can be seen as strongly interacting systems. Since the early 1970s, physics has
developed a wide range of concepts and models to treat such topics efficiently; these include
(fractal and multifractal) scaling, frustrated disordered systems, and far-from-equilibrium
phenomena. Understanding how similarly complex patterns arise from human activity, albeit
truly challenging, seems a natural continuation of such efforts.

While remarkable success has been achieved, studies in econophysics are often rooted in possible
analogies, even though there are important differences between physical and financial systems.
Despite the obvious similarities to interacting systems, here we would like to emphasize the
discrepancy in the levels of description. For example, in the case of a physical system undergoing
a second-order phase transition, it is natural to assume scaling on profound theoretical grounds,
and the (experimental or theoretical) determination of, e.g., the critical exponents is a fully
justified undertaking. There is no similar theoretical basis for the financial market whatsoever;
therefore, in this case the assumption of power laws should be considered only as one possible way
of fitting fat-tailed distributions. Also, the reference to universality is not plausible, as the
robustness of qualitative features, like the fat tail of the distributions, is a much weaker
property. While we fully acknowledge the process of understanding based on analogies as an
important method of scientific progress, we emphasize that special care has to be taken in cases
where the theoretical support is sparse.

The aim of this paper is to summarize some recent advances that help to understand these
fundamental differences. We present evidence that the size of companies strongly affects the
characteristics of trading activity of their stocks, in a way which is incompatible with the
popular assumption of universality in trading dynamics. Instead, certain stylized facts have a
well-defined dependence on company capitalization. Therefore, e.g., averaging distributions over
companies with very different capitalization is questionable.

The paper is organized as follows. Section I introduces the notations and data that were used.
Section II shows that various measures of trading activity depend on capitalization in a
non-trivial way. In Sec. III we analyze the correlations present in traded value time series, and
find that the Hurst exponent increases logarithmically with the mean traded value per minute.
Section IV deals with a similar size dependence of correlations present in the time intervals
between trades. Finally, Section V concludes.



I. NOTATIONS AND DATA 

For time windows of size $\Delta t$, let us write the total traded value (activity, flow) of the
$i$-th stock at time $t$ as

$$f_i^{\Delta t}(t) = \sum_{n,\; t_i(n) \in [t,\, t+\Delta t]} V_i(n), \tag{1}$$

where $t_i(n)$ is the time of the $n$-th transaction of the $i$-th stock. This corresponds to the
coarse-graining of the individual events, or the so-called tick-by-tick data. $V_i(n)$ is the
value traded in transaction $n$, and it can be calculated as the product of the price $p$ and the
traded volume of stocks $\tilde{V}$,

$$V_i(n) = p_i(n)\, \tilde{V}_i(n). \tag{2}$$

Price does not change very much from trade to trade, so the dominant factor in the fluctuations
and the statistical properties of $f$ is given by the variation of the number of stocks exchanged
in the transactions, $\tilde{V}$. Price serves as a conversion factor to a common unit (US
dollars), and it makes the comparison of stocks possible, while also automatically correcting the
data for stock splits. The statistical properties (normalized distribution, correlations, etc.)
are otherwise practically indistinguishable between traded volume and traded value.

We used empirical data from the TAQ database, which records all transactions of the New York
Stock Exchange and NASDAQ for the years 1993 - 2003.

Finally, note that throughout the paper we use base-10 logarithms.
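The coarse-graining of Eqs. (1) and (2) can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used to obtain the results of this paper: the function name `traded_value_series`, its arguments, and the use of half-open windows $[t, t+\Delta t)$ (so that a trade falling on a window boundary is counted only once) are assumptions made here.

```python
import numpy as np

def traded_value_series(times, prices, volumes, dt, t_start, t_end):
    """Coarse-grain tick data of one stock into windows of length dt.

    times, prices, volumes: per-transaction arrays (hypothetical inputs),
    with times in the same unit as dt. Entry k of the result is the total
    traded value in [t_start + k*dt, t_start + (k+1)*dt).
    """
    times = np.asarray(times, dtype=float)
    # Value of each transaction: price times number of shares, cf. Eq. (2).
    values = np.asarray(prices, dtype=float) * np.asarray(volumes, dtype=float)
    n_windows = int(np.floor((t_end - t_start) / dt))
    idx = np.floor((times - t_start) / dt).astype(int)
    keep = (idx >= 0) & (idx < n_windows)
    # Sum the values per window, cf. Eq. (1); windows without trades stay zero.
    return np.bincount(idx[keep], weights=values[keep], minlength=n_windows)
```

With `dt` equal to one minute, the time average of the returned array corresponds to the mean traded value per minute of that stock.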



II. CAPITALIZATION AFFECTS BASIC 
MEASURES OF TRADING ACTIVITY 

Most previous studies are restricted to an analysis of the stocks of large companies. These are
traded frequently, and so price and returns are well defined even on the time scale of a few
seconds. Nevertheless, other quantities regarding the activity of trading, such as traded value
and volume or the number of trades, can be defined even for those stocks where they are zero for
most of the time. In this section we extend the study of Zumbach [10], which concerned the 100
large companies included in the London Stock Exchange's FTSE-100 market index. This set spans
about two orders of magnitude in capitalization. Instead, we analyze the 3347 stocks [28] that
were traded continuously at NYSE for the year 2000. This gives us a substantially larger range of
capitalization, approximately $10^{6} \ldots 6 \cdot 10^{11}$ USD.

Following Ref. [10], in order to quantify how the value of the capitalization $C_i$ of a company
is reflected in the trading activity of its stock, we plotted the mean value per trade
$\langle V_i \rangle$, the mean number of trades per minute $\langle N_i \rangle$ and the mean
activity (traded value per minute) $\langle f_i \rangle$ versus capitalization in Fig. 1.
Ref. [10] found that all three quantities have a power-law dependence on $C_i$; however, this
simple ansatz does not seem to work for our extended range of stocks. While the mean trading
activity can be approximated, to a reasonable quality, as $\langle f_i \rangle \propto C_i^{0.98}$,
neither $\langle V \rangle$ nor $\langle N \rangle$ can be fitted by a single power law over the
whole range of capitalization. Nevertheless, there is an unsurprising monotonic dependence:
stocks of higher capitalization are traded more intensively.

One can gain further insight from Fig. 1(d), which eliminates the capitalization variable and
shows $\langle V \rangle$ versus $\langle N \rangle$. For the largest 1600 stocks we find the
scaling relation

$$\langle V_i \rangle \propto \langle N_i \rangle^{\beta}, \tag{3}$$

with $\beta = 0.57 \pm 0.09$. The estimate based on the results of Zumbach [10] for the stocks in
London's FTSE-100 is $\beta \approx 1$, while Ref. [11] finds $\beta = 0.22 \pm 0.04$ for NASDAQ.
The regime of smaller stocks shows no clear tendency.
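The exponent of a relation such as Eq. (3) is commonly estimated by a straight-line fit on doubly logarithmic axes. The sketch below does this with ordinary least squares on synthetic data. The fitting method, the function name `fit_power_law` and all synthetic parameters are assumptions for illustration, since the text does not specify how the fits were made.

```python
import numpy as np

def fit_power_law(x, y):
    """Fit log10(y) = a + beta * log10(x) by least squares; return (beta, a)."""
    log_x = np.log10(np.asarray(x, dtype=float))
    log_y = np.log10(np.asarray(y, dtype=float))
    beta, a = np.polyfit(log_x, log_y, 1)  # slope first, then intercept
    return beta, a

# Synthetic stand-in for (<N_i>, <V_i>) pairs; this is NOT market data.
rng = np.random.default_rng(0)
mean_n = 10 ** rng.uniform(-1.3, 1.5, size=500)  # trades/min, all above about 0.05
mean_v = 1e4 * mean_n ** 0.57 * 10 ** rng.normal(0.0, 0.05, size=500)

beta, a = fit_power_law(mean_n, mean_v)
print(f"fitted exponent beta = {beta:.2f}")
```

Restricting the input to stocks above a threshold in $\langle N \rangle$, as done in the text for the largest stocks, only requires masking both arrays before the call.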

One possible interpretation of the effect is the following. Smaller stocks are exchanged rarely,
but there must exist a smallest exchanged value that is still profitable to use due to
transaction costs, so $\langle V \rangle$ cannot decrease indefinitely. On the other hand, once a
stock is exchanged more often (the change happens at about $\langle N \rangle = 0.05$ trades/min),
it is no longer traded in this minimal profitable unit. With more intensive trading, trades
"stick together": liquidity allows the exchange of larger packages. This increase is clear, but
not very large, up to one order of magnitude. Although increasing package sizes reduce
transaction costs, price impact increases, and profits will decrease again. The balance between
these two effects can determine package sizes and may play a role in the formation of the scaling
relation (3).

III. NON-UNIVERSAL CORRELATIONS OF 
TRADED VALUE 

Scaling methods have long been used to characterize stock market time series, including prices
and trading volumes. In particular, the Hurst exponent $H(i)$ is often calculated. For the traded
value time series $f_i^{\Delta t}(t)$ of stock $i$, it can be defined as

$$\sigma_i^2(\Delta t) = \left\langle \left( f_i^{\Delta t}(t) - \left\langle f_i^{\Delta t}(t) \right\rangle \right)^2 \right\rangle \propto \Delta t^{2H(i)}, \tag{4}$$

where $\langle \cdot \rangle$ denotes time averaging with respect to $t$. The signal is said to be
correlated (persistent) when $H > 0.5$, uncorrelated when $H = 0.5$, and anticorrelated
(antipersistent) for $H < 0.5$. It is not a trivial fact, but several recent papers (see, e.g.,
Ref. [20]) point out that the variance on the left-hand side exists for any stock's traded value
and any time scale $\Delta t$. Therefore, we carried out measurements of $H$ on all 2647 stocks
that were continuously traded on NYSE in the period 2000 - 2002. We investigated separately the
4039 stocks that were traded at NASDAQ for the same period.

We find that stock market activity has a much richer behavior than simply all stocks having
Hurst exponents statistically distributed around an average value, as assumed in Ref. [21].
Instead, there is a crossover [22, 23] between two types of behavior around the time scale of a
few hours to 1 trading day. An essentially uncorrelated regime was found when $\Delta t < 20$ min
for NYSE and $\Delta t < 2$ min for NASDAQ, while the time series of larger companies become
strongly correlated when $\Delta t > 300$ min for NYSE and $\Delta t > 60$ min for NASDAQ. As a
reference, we also calculated the Hurst exponents $H_{\mathrm{shuff}}(i)$ of the shuffled time
series. The results are plotted in Fig. 2.
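A variance-scaling estimate of the Hurst exponent and its shuffled reference can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `hurst_from_variance`, the use of non-overlapping windows, and the synthetic test signal (an AR(1) process whose correlations decay over roughly 100 steps, so that it looks persistent over the windows used) are all choices made here.

```python
import numpy as np

def hurst_from_variance(f, window_sizes):
    """Estimate H from the scaling sigma^2(dt) ~ dt^(2H) of window sums.

    f: series at the finest resolution (e.g. traded value per minute).
    window_sizes: window lengths in units of that base resolution.
    """
    f = np.asarray(f, dtype=float)
    log_dt, log_var = [], []
    for m in window_sizes:
        n = len(f) // m
        # Non-overlapping window sums: the series at time scale dt = m steps.
        windowed = f[: n * m].reshape(n, m).sum(axis=1)
        log_dt.append(np.log10(m))
        log_var.append(np.log10(windowed.var()))
    slope, _ = np.polyfit(log_dt, log_var, 1)
    return slope / 2.0  # the variance scales with exponent 2H

rng = np.random.default_rng(1)
# Synthetic persistent signal; this is NOT market data.
noise = rng.normal(size=2 ** 15)
x = np.empty_like(noise)
x[0] = noise[0]
for k in range(1, len(noise)):
    x[k] = 0.99 * x[k - 1] + noise[k]

windows = [1, 2, 4, 8, 16, 32, 64]
h_orig = hurst_from_variance(x, windows)
# Shuffling keeps the distribution of values but destroys their ordering.
h_shuff = hurst_from_variance(rng.permutation(x), windows)
print(f"H = {h_orig:.2f}, H_shuff = {h_shuff:.2f}")
```

For data with a crossover, one would call the function separately on window ranges below and above the crossover, since a single straight line does not describe both regimes.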

Figure 1: (a)-(c) Capitalization dependence of certain measures of trading activity in the year
2000. The functions are monotonically increasing and can be piecewise approximated by power laws
as indicated. All three tendencies break down for large capitalizations. (a) Mean value per trade
$\langle V \rangle$ in USD. The fitted slope corresponds to the regime
$5 \cdot 10^{7} < C < 7.5 \cdot 10^{10}$ in USD. (b) Mean number of trades per minute
$\langle N \rangle$. The slope on the left is from a fit to $C < 4.5 \cdot 10^{9}$ USD, while the
one on the right is for $C > 4.5 \cdot 10^{9}$ USD. (c) Mean trading activity (exchanged value per
minute) $\langle f \rangle$ in USD. The plots include 3347 stocks that were continuously available
at NYSE during 2000. (d) Plot of mean value per trade $\langle V \rangle$ versus mean number of
trades per minute $\langle N \rangle$ for the year 2000 of NYSE. For smaller stocks there is no
clear tendency. For the top ~1600 companies ($\langle N \rangle > 0.05$ trades/min), however,
there is scaling with an exponent $\beta = 0.57 \pm 0.08$.

One can see that for shorter time windows, correlations are absent in both markets,
$H(i) \approx 0.51$ to $0.53$. For windows longer than a trading day, however, while small
$\langle f \rangle$ stocks again display only very weak correlations, larger ones show up to
$H \approx 0.9$. Furthermore, there is a distinct logarithmic trend in the data:

$$H(i) = H^{*} + \gamma \log \langle f_i \rangle, \tag{5}$$

with $\gamma(\Delta t > 300\ \mathrm{min}) = 0.06 \pm 0.01$ for NYSE and
$\gamma(\Delta t > 60\ \mathrm{min}) = 0.05 \pm 0.01$ for NASDAQ. This result can be predicted by
a general framework based on a new type of scaling law [24]. Shorter time scales correspond to the
special case $\gamma = 0$: there is no systematic trend in $H$. After shuffling the time series,
as expected, they become uncorrelated and show $H_{\mathrm{shuff}}(i) \approx 0.5$ at all time
scales and without significant dependence on $\langle f_i \rangle$.
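To make the size of the logarithmic trend in Eq. (5) concrete: since base-10 logarithms are used, the slope is the change of the Hurst exponent per decade of mean traded value. For two stocks labeled 1 and 2 (hypothetical labels, with an illustrative activity ratio that is not taken from the data), the intercept cancels:

```latex
\[
H(2) - H(1) = \gamma \log \frac{\langle f_2 \rangle}{\langle f_1 \rangle},
\qquad \text{e.g.} \quad
\frac{\langle f_2 \rangle}{\langle f_1 \rangle} = 10^{4},\ \gamma = 0.06
\;\Rightarrow\; H(2) - H(1) = 0.06 \times 4 = 0.24 .
\]
```

A difference in activity of four orders of magnitude thus separates a nearly uncorrelated stock from a clearly persistent one.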

It is to be emphasized that the crossover is not simply between uncorrelated and correlated
regimes. It is instead between homogeneous (all stocks show $H(i) \approx H_1$, $\gamma = 0$) and
inhomogeneous ($\gamma > 0$) behavior. One finds $H_1 \approx 0.5$, but very small
$\langle f \rangle$ stocks do not depart much from this value even for large time windows. This is
a clear relation to company size, as $\langle f \rangle$ is a monotonically growing function of
company capitalization (see Sec. II and Ref. [10]).

Dependence of the effect on $\langle f \rangle$ is in fact a dependence on company size. This is
direct evidence of non-universality. The trading mechanism that governs the marketplace depends
strongly on the stock that is traded. In a physical sense, there are no universality classes [25]
comprising a given group of stocks and characterized by a set of stylized facts, such as Hurst
exponents. Instead, there is a continuous spectrum of company sizes and the stylized facts may
depend continuously on company size/capitalization.



Systematic dependence of the exponent of the power spectrum of the number of trades on
capitalization was previously reported in an earlier study of 88 stocks. That quantity is closely
related to the Hurst exponent of the respective time series (see Ref. [22]). Direct analysis
finds a strong, monotonic increase of the Hurst exponent of $N$ with growing $\langle N \rangle$,
but no such clear logarithmic trend as Eq. (5).



IV. NON-UNIVERSAL CORRELATIONS OF 
INTERTRADE TIMES 

To strengthen the arguments of Sec. III, we carried out a similar analysis of the intertrade
interval series $T_i(n = 1 \ldots N_i - 1)$, defined as the time spacings between the $n$-th and
$(n+1)$-th trades. $N_i$ is the total number of trades for stock $i$ during the period under
study.

Previously, Ref. [22] used 30 stocks from the TAQ database for the period 1993 - 1996 and
proposed that $H_T$ has the universal value $0.94 \pm 0.05$.

Figure 2: Behavior of the Hurst exponents $H(i)$ for the period 2000 - 2002, and two markets
((a) NYSE, (b) NASDAQ). For short time windows (○), all signals are nearly uncorrelated,
$H(i) \approx 0.51$ to $0.52$, regardless of stock market. The fitted slopes are
$\gamma_{\mathrm{NYSE}}(\Delta t < 20\ \mathrm{min}) = 0.001 \pm 0.002$ and
$\gamma_{\mathrm{NASDAQ}}(\Delta t < 2\ \mathrm{min}) = 0.003 \pm 0.002$. For larger time windows
(■), the strength of correlations depends logarithmically on the mean trading activity of the
stock, $\gamma_{\mathrm{NYSE}}(\Delta t > 300\ \mathrm{min}) = 0.06 \pm 0.01$ and
$\gamma_{\mathrm{NASDAQ}}(\Delta t > 60\ \mathrm{min}) = 0.05 \pm 0.01$. Shuffled data (▽) display
no correlations, thus $H_{\mathrm{shuff}}(i) = 0.5$. Insets: The $\log \sigma$-$\log \Delta t$
scaling plots (■) for two example stocks, GE (NYSE) and DELL (NASDAQ). The darker shaded intervals
have well-defined Hurst exponents; the crossover is indicated with a lighter background. Results
for shuffled time series (○) were shifted vertically for better visibility.

We analyzed the same database, but included a large number of stocks with very different
capitalizations. First, it has to be noted that the mean intertrade interval has decreased
drastically over the years. In this sense the stock market cannot be considered stationary for
periods much longer than one year. We analyzed the two-year period 1994 - 1995 (part of that used
in Ref. [22]) and separately the single year 2000. We used all stocks in the TAQ database with
$\langle T \rangle < 10^{5}$ sec, a total of 3924 and 4044 stocks, respectively.
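The construction of the intertrade series and the selection by mean intertrade time can be sketched as below. The function names, the dictionary layout and the toy timestamps are hypothetical. The text does not say how overnight gaps were treated, so this sketch simply takes differences of consecutive timestamps.

```python
import numpy as np

def intertrade_times(trade_times):
    """Return T(n) = t(n+1) - t(n) for one stock; times in seconds, sorted."""
    return np.diff(np.asarray(trade_times, dtype=float))

def select_stocks(trades_by_stock, max_mean_T=1e5):
    """Keep stocks whose mean intertrade time <T> is below max_mean_T seconds."""
    selected = {}
    for ticker, times in trades_by_stock.items():
        T = intertrade_times(times)
        # A stock with fewer than two trades has no intertrade interval.
        if len(T) > 0 and T.mean() < max_mean_T:
            selected[ticker] = T
    return selected

# Toy timestamps in seconds; the tickers are made up for the example.
toy = {"AAA": [0, 30, 90, 100], "BBB": [0, 4e5, 8e5], "CCC": [12.0]}
kept = select_stocks(toy)
```

In this toy input only the first stock passes the filter: the second has a mean intertrade time above the threshold and the third has a single trade.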

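The Hurst exponents for the time series $T_i$ can be written, analogously to Eq. (4), as

$$\sigma_{T_i}^2(N) = \left\langle \left( \sum_{n=1}^{N} T_i(n) - \left\langle \sum_{n=1}^{N} T_i(n) \right\rangle \right)^2 \right\rangle \propto N^{2H_T(i)}, \tag{6}$$

where the series is not defined in time, but instead on a tick-by-tick basis, indexed by the
number of transactions.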
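As a consistency check of this definition, consider the reference case of independent, identically distributed intertrade times with finite variance $\sigma_T^2$ (an idealization assumed here for illustration only). The variance of a sum of $N$ independent terms is additive, so

```latex
\[
\sigma_{T_i}^2(N) = \sum_{n=1}^{N} \operatorname{Var}\!\left[ T_i(n) \right]
= N \sigma_T^2 \propto N^{2 \cdot \frac{1}{2}},
\]
```

which gives $H_T(i) = 1/2$. Values above $1/2$ therefore signal positive correlations between successive intertrade times.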

The data show a crossover, similar to that for the traded value $f$, from a lower to a higher
value of $H_T(i)$ when the window size is approximately the daily mean number of trades (for an
example, see the inset of Fig. 3). For the restricted set studied in Ref. [22], the value
$H_T \approx 0.94 \pm 0.05$ was suggested for window sizes above the crossover.

Similarly to the case of the traded value Hurst exponents analyzed in Section III, the inclusion
of more stocks [29] reveals the underlying systematic non-universality. Again, less frequently
traded stocks appear to have weaker autocorrelations, as $H_T$ decreases monotonically with
growing $\langle T \rangle$. One can fit an approximate logarithmic law [30, 31] to characterize
the trend:

$$H_T = H_T^{*} + \gamma_T \log \langle T \rangle, \tag{7}$$

where $\gamma_T = -0.10 \pm 0.02$ for the period 1994 - 1995 (see Fig. 3) and
$\gamma_T = -0.08 \pm 0.02$ for the year 2000.

In their recent preprint, Yuen and Ivanov [23] independently show a tendency similar to Eq. (7)
for intertrade times of NYSE and NASDAQ in a different set of stocks.



V. CONCLUSIONS 

In this paper we have summarized a few recent advances in understanding the role of company size
in trading dynamics. We revisited a number of previous studies of stock market data and found
that the extension of the range of capitalization of the studied firms reveals a new aspect of
stylized facts: the characteristics of trading display a fundamental dependence on
capitalization.

We have shown that the trading activity $\langle f \rangle$, the number of trades per minute
$\langle N \rangle$ and the mean size of transactions $\langle V \rangle$ display a non-trivial,
monotonic dependence on company capitalization, which cannot be described by a simple power law.
On the other hand, for moderate to large companies, a power law gives an acceptable fit for the
dependence of the mean transaction size on the trading frequency.

The Hurst exponents for the variance of traded value and of intertrade times can be defined, and
they depend logarithmically on the mean trading activity $\langle f \rangle$ and the mean
intertrade time $\langle T \rangle$, respectively.

Figure 3: Hurst exponents of $T_i$ for windows greater than 1 day, plotted versus the mean
intertrade time $\langle T_i \rangle$. Stocks that are traded less frequently show markedly
weaker persistence of $T$ for time scales longer than 1 day. The dotted horizontal line serves as
a reference. We used stocks with $\langle T \rangle < 10^{5}$ sec; the sample period was
1994 - 1995. The inset shows the two regimes of correlation strength for the single stock General
Electric (GE) on a log-log plot of $\sigma(N)$ versus $N$. The slopes corresponding to Hurst
exponents are 0.6 and 0.89.



These findings imply that special care must be taken when the concepts of scaling and
universality are applied to financial processes. For the modeling of stock market processes, one
should always consider that many characteristic quantities depend strongly on the capitalization.
The introduction of such models seems a real challenge at present.